Abstract

We present an achievable rate for general Gaussian relay networks. We show that the achievable rate is within 
a constant number of bits from the information-theoretic cut-set upper bound on the capacity of these networks. 
This constant depends on the topology of the network, but not the values of the channel gains. Therefore, we 
uniformly characterize the capacity of Gaussian relay networks within a constant number of bits, for all channel 
parameters. 
I. Introduction



Characterizing the capacity of wireless relay networks has been a challenging problem over the past
couple of decades. Although many communication schemes have been developed [6]-[10], the capacity of
even the simplest Gaussian relay network (single source, single destination, single relay) is still unknown.
In general, the only known upper bound on the capacity of Gaussian relay networks is the information-theoretic
cut-set upper bound, which is not achieved by any of those schemes, not even for a fixed realization
of the channel gains. Furthermore, in a general network with a wide range of channel parameters, the gap
between those achievable rates and the cut-set upper bound is unclear. As a result, we do not even have
a good approximation of the capacity with an explicit guarantee.

In this paper we introduce a simple coding strategy for general Gaussian relay networks. In this scheme
each relay first quantizes the received signal at the noise level, then randomly maps it to a Gaussian
codeword and transmits it. We show that this achieves a rate that is guaranteed to be within a constant
gap from the cut-set bound. This constant depends on the topological parameters of the network (the number
of nodes in the network), but not on the values of the channel gains. Therefore, we get an approximation
of the capacity of Gaussian relay networks that is uniformly good over all values of the channel
gains, and thus particularly good at high SNR. The presented scheme has close connections to the
random coding scheme introduced in [2] to achieve the capacity of wireline networks. It also has some
connections with the compress, hash, and forward protocol described in [8], except that here the destination
is not required to decode the quantized signals at the relays.

The ideas for the main approximation result were inspired by the insight obtained by analyzing 
deterministic relay networks (see [5]). The deterministic approach was motivated by the development 
of the linear deterministic model (see [3], [4]), which was seen to capture the key features of wireless 
channels. We developed some of the connections between the linear deterministic relay network and the 
Gaussian relay network in [4]. 

II. Problem statement and main results 

Consider a network represented by a directed relay network $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set of vertices
representing the communication nodes in the relay network and $\mathcal{E}$ is the set of edges between nodes. The
communication problem considered is unicast. Therefore a special node $S \in \mathcal{V}$ is considered the source
of the message and a special node $D \in \mathcal{V}$ is the intended destination. All other nodes in the network
facilitate communication between $S$ and $D$. The received signal $y_j$ at node $j \in \mathcal{V}$ and time $t$ is given by

$$y_j^{[t]} = \sum_{i \in \mathcal{N}_j} h_{ij} x_i^{[t]} + z_j^{[t]} \tag{1}$$

where each $h_{ij}$ is a complex number representing the channel gain from node $i$ to node $j$, and $\mathcal{N}_j$ is the
set of nodes that are neighbors of $j$ in $\mathcal{G}$. Furthermore, we assume there is an average power constraint
equal to 1 at each transmitter. Also $z_j$, representing the channel noise, is modeled as a complex normal
(Gaussian) random variable

$$z_j \sim \mathcal{CN}(0, 1) \tag{2}$$
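The channel model in (1) and (2) can be sketched numerically as follows. This is only an illustrative sketch: the node names, the gain values, and the helper names `complex_normal` and `received_signal` are assumptions introduced here, not part of the paper.

```python
import math
import random

def complex_normal(rng, variance=1.0):
    """Draw one CN(0, variance) sample: real and imaginary parts are
    independent real Gaussians, each with variance/2."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def received_signal(j, gains, neighbors, x, rng):
    """One time sample of (1): y_j = sum_{i in N_j} h_ij * x_i + z_j,
    with unit-variance complex Gaussian noise z_j as in (2)."""
    noise = complex_normal(rng)
    return sum(gains[(i, j)] * x[i] for i in neighbors[j]) + noise

rng = random.Random(0)
# Hypothetical example: relays "A1" and "A2" are both heard by "D".
gains = {("A1", "D"): 2.0 + 1.0j, ("A2", "D"): 0.5 - 0.3j}
neighbors = {"D": ["A1", "A2"]}
# Transmit symbols with unit average power (Gaussian inputs assumed here).
x = {"A1": complex_normal(rng), "A2": complex_normal(rng)}
y_D = received_signal("D", gains, neighbors, x, rng)
```

The superposition of the neighbors' signals at the receiver is what distinguishes this model from a wireline network, where each edge would carry its own signal.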

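For any relay network, there is a natural information-theoretic cut-set bound [11], which upper bounds
the reliable transmission rate $R$:

$$R < \overline{C} = \max_{p(\{x_i\}_{i \in \mathcal{V}})} \min_{\Omega \in \Lambda_D} I(Y_{\Omega^c}; X_\Omega | X_{\Omega^c}) \tag{3}$$

where $\Lambda_D = \{\Omega : S \in \Omega, D \in \Omega^c\}$ is the set of all source-destination cuts (partitions).
The following is our main result.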

Theorem 2.1: Given a Gaussian relay network $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, we can achieve all rates $R$ up to $\overline{C} - \kappa$.
Therefore the capacity $C$ of this network satisfies

$$\overline{C} - \kappa \le C \le \overline{C} \tag{4}$$

where $\overline{C}$ is the cut-set upper bound on the capacity of $\mathcal{G}$ as described in equation (3), and $\kappa$ is a constant
upper bounded by $5|\mathcal{V}|$, where $|\mathcal{V}|$ is the total number of nodes in $\mathcal{G}$.
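To make the objects in (3) and Theorem 2.1 concrete, the sketch below enumerates the source-destination cuts of a small network and evaluates the $5|\mathcal{V}|$ bound on the gap. The function names and the node labels are assumptions made for illustration only.

```python
from itertools import combinations

def source_destination_cuts(nodes, source, dest):
    """Yield every cut Omega with source in Omega and dest in its complement."""
    relays = [v for v in nodes if v not in (source, dest)]
    for r in range(len(relays) + 1):
        for subset in combinations(relays, r):
            yield frozenset((source,) + subset)

def gap_bound_bits(num_nodes):
    """Upper bound 5|V| on the constant in Theorem 2.1."""
    return 5 * num_nodes

# Hypothetical labels for a four-node diamond network.
nodes = ["S", "A1", "A2", "D"]
cuts = list(source_destination_cuts(nodes, "S", "D"))
```

Each relay is either on the source side or on the destination side, so a network with $|\mathcal{V}|$ nodes has $2^{|\mathcal{V}|-2}$ cuts; the bound on the gap depends only on $|\mathcal{V}|$ and never on the channel gains.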

The gap $\kappa$ holds for all values of the channel gains and is relevant particularly when the SNR is
high and the capacity is large. While it is possible to improve $\kappa$ further, in this paper we focus on proving
that such a constant, depending only on the topology of $\mathcal{G}$ but not on the channel parameters, exists in general.
This constant-gap result is far stronger than a degree-of-freedom result, not only because it is
non-asymptotic but also because it is uniform in the many channel SNRs. This is also the first constant-gap
approximation of the capacity of Gaussian relay networks. As we will discuss in the next section,
the gap between the achievable rate of other well-known relaying schemes and the cut-set upper bound
in general depends on the channel parameters and can become arbitrarily large.

A. Examples 

In this section we use a few examples to show that the gap between the achievable rate of other relaying 
schemes and the cut-set upper bound depends on the channel parameters and can become arbitrarily large. 
In particular we focus on three well known strategies: amplify-forward, decode-forward, and compress- 
forward. 

1) Amplify-forward strategy: Consider the diamond network with real channel gains shown in Figure
1(a). Assume $a$ is a large real number. The cut-set upper bound is approximately

$$\overline{C} \approx 5 \log a \tag{5}$$

Now consider an amplify-forward strategy in which nodes $A_1$ and $A_2$ amplify the received signal by
$\alpha_1$ and $\alpha_2$ and forward them to the destination. Then, assuming that $x$ was transmitted at the source, the
received signal at the destination will be

$$y_D = a^3 \alpha_1 (a^5 x + z_{A_1}) + a^5 \alpha_2 (a^2 x + z_{A_2}) + z_D \tag{6}$$

where $z_{A_1}$, $z_{A_2}$ and $z_D$ are Gaussian noises with variance 1 and $x$ is the transmitted signal with average
power constraint equal to 1. To satisfy the average transmit power constraint at $A_1$ and $A_2$, for large
values of $a$ we should have

$$\alpha_1 \le \frac{1}{a^5}, \qquad \alpha_2 \le \frac{1}{a^2} \tag{7}$$



Now since (6) is just like a point-to-point channel from $S$ to $D$, the achievable rate of the amplify-forward
strategy will approximately be
(8)
(9) 
(10) 



Now by comparing (10) and (5) we note that as $a$ increases the gap between the achievable rate of the
amplify-forward strategy and the cut-set upper bound increases. By Theorem 3.7 in Section III-C,
which is a special case of our main Theorem 2.1 for multi-stage networks, the achievable rate of the
relaying strategy proposed in this paper is within $\frac{1}{2} \times 12 = 6$ bits of the cut-set upper bound of this network
for all channel parameters¹.





Fig. 1. The diamond network is shown in (a). A two-layer network is shown in (b). The effective network for the compress-forward strategy is
shown in (c).

2) Decode-forward strategy: Consider the same example as shown in Figure 1(a). It is easy to
show that the achievable rate of the decode-forward strategy is upper bounded by

$$R_{DF} \le 3 \log a \tag{11}$$

Therefore, as $a$ gets larger, the gap between the achievable rate of the decode-forward strategy and the cut-set
upper bound (5) increases.



¹The factor of $\frac{1}{2}$ comes from the fact that here we are dealing with channels with real-valued gains.



3) Compress-forward strategy: Consider the example shown in Figure 1(b). For large values of $a$, the cut-set
upper bound on the capacity of this relay network is approximately

$$\overline{C} \approx 5 \log a \tag{12}$$

Now consider the compress-forward strategy as described in [10], Section V. The achievable rate of
this scheme is characterized in Theorem 3 ([10], page 9), which is in the form of a mutual information
maximization over auxiliary random variables $U_r$ and $\hat{Y}_r$. Even though this is written in single-letter
form, since there are no cardinality bounds, the rate optimization is still an infinite-dimensional optimization
problem. However, to simplify this problem further, assume that the auxiliary random variables $U_r$ are set to
zero and the $\hat{Y}_r$ are restricted to have a Gaussian distribution, which leads to a finite-dimensional problem.

The scheme is such that the Wyner-Ziv source-coding region of each layer must intersect the
channel-coding region of the next layer. As a result, by looking at layer $\{B_1, B_2\}$ we note that node $B_1$ should
compress its received signal to a Gaussian random variable with variance $a^2$. In other words, it just quantizes
the received signal with distortion $a$. Therefore the effective network will look like the one shown in Figure
1(c). Note that now the cut-set upper bound of this network is approximately $\overline{C} \approx 4 \log a$.

As a result, with this compress-forward scheme, it is not possible to get a rate more than $4 \log a$.
As $a$ increases, the gap between the achievable rate of the compress-forward strategy and the cut-set upper
bound increases. By Theorem 3.7 in Section III-C, which is a special case of our main Theorem
2.1 for multi-stage networks, the achievable rate of the relaying strategy proposed in this paper is within
$\frac{1}{2} \times 18 = 9$ bits of the cut-set upper bound of this network for all channel parameters.

B. Proof Strategy 

Theorem 2.1 is the main result of the paper and the rest of the paper is devoted to sketching its proof. For
details of the proof, the reader is referred to [1]. First we focus on networks that have a layered structure,
i.e. all paths from the source to the destination have equal lengths. With this special structure we get a
major simplification: a sequence of messages can each be encoded into a block of symbols and the blocks
do not interact with each other as they pass through the relay nodes in the network. The proof of the
result for layered networks is done in Section III. Second, we extend the result to an arbitrary network by
considering its time-expanded representation². This is done in Section IV. The time-expanded network is
layered and we can apply our result in the first step to it. To complete the proof of the result, we need
to establish a connection between the cut values of the time-expanded network and those of the original
network. We do this using sub-modularity properties of the entropy function.

III. Layered networks 

In this section we prove our main Theorem 2.1 for the special case of layered networks, where all paths from
the source to the destination in $\mathcal{G}$ have equal length. In a layered network, for each node $j$ we have a
length $l_j$ from the source, and all the incoming signals to node $j$ are from nodes $i$ whose distance from
the source is $l_i = l_j - 1$. Therefore, as in the example network of Figure 2, we see that there is message
synchronization, i.e., all signals arriving at node $j$ are encoding the same sub-message.

Suppose message $w_k$ is sent by the source in block $k$. Then, since each relay $j$ operates only on blocks
of length $T$, the signals received at block $k$ at any relay pertain only to message $w_{k-l_j}$, where $l_j$ is the
path length from the source to relay $j$. To indicate this explicitly, we denote by $\mathbf{y}_j^{(k)}(w_{k-l_j})$ the received
signal at block $k$ at node $j$. We also denote the transmitted signal at block $k$ by $\mathbf{x}_j^{(k)}(w_{k-1-l_j})$.

²The concept of time-expanded representation is also used in [2], but the use there is to handle cycles. Our main use is to handle interaction
between messages transmitted at different times, an issue that only arises when there is interference at nodes.



A. Encoding 

We have a single source $S$ with message $W \in \{1, 2, \ldots, 2^{RT}\}$, which is encoded by the source $S$ into
a signal over $T$ transmission times (symbols), giving an overall transmission rate of $R$.

Each relay operates over blocks of $T$ symbols. In particular, block $k$ of $T$ received symbols
at node $i$ is denoted by $\mathbf{y}_i^{(k)} = \{y_i^{[(k-1)T+1]}, \ldots, y_i^{[kT]}\}$ and the transmit symbols by $\mathbf{x}_i^{(k)}$. The
achievability strategy is the following: each received sequence $\mathbf{y}_i^{(k)}$ at node $i$ is quantized into $\hat{\mathbf{y}}_i^{(k)}$, which
is then randomly mapped into a Gaussian codeword $\mathbf{x}_i^{(k)}$ using a random (binning) function $f_i(\hat{\mathbf{y}}_i^{(k)})$. For
quantization, we use a Gaussian vector quantizer.
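The relay operation described above (quantize at the noise level, then map randomly to a Gaussian codeword) can be sketched as follows. This is a simplified illustration under stated assumptions: the Gaussian vector quantizer of the scheme is replaced here by unit-step rounding of each real and imaginary part, and the class name `Relay` and its methods are hypothetical.

```python
import math
import random

class Relay:
    """Toy quantize-and-map relay operating on blocks of T symbols."""

    def __init__(self, block_len, seed):
        self.T = block_len
        self.rng = random.Random(seed)
        self.codebook = {}  # quantized block -> Gaussian codeword

    @staticmethod
    def quantize(block):
        # Assumption: scalar rounding with step 1 (the noise standard
        # deviation) stands in for the Gaussian vector quantizer.
        return tuple((round(y.real), round(y.imag)) for y in block)

    def encode(self, block):
        """Random mapping f_i: every distinct quantized block gets an
        independent CN(0, 1) codeword, drawn once and then reused."""
        key = self.quantize(block)
        if key not in self.codebook:
            s = math.sqrt(0.5)
            self.codebook[key] = [
                complex(self.rng.gauss(0.0, s), self.rng.gauss(0.0, s))
                for _ in range(self.T)
            ]
        return self.codebook[key]

relay = Relay(block_len=4, seed=7)
codeword = relay.encode([0.1 + 0.2j, 1.4 - 0.6j, -2.2 + 0.1j, 3.0 + 0.0j])
```

Drawing codewords lazily, on first use, is statistically the same as fixing an independent random codebook in advance; the relay never tries to decode the message.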

Since we have a layered network, without loss of generality consider the message $w = w_1$ transmitted
by the source at block $k = 1$. At node $j$ the signals pertaining to this message are received by the relays
at block $l_j$. Given the knowledge of all the encoding functions at the relays and the signals received at block
$l_D$, the decoder $D$ attempts to decode the message $W$ by finding the message that is jointly typical with
its observations.

B. Proof illustration 

Consider the encoding-decoding strategy as described in Section III-A. Our goal is to show that,
using this strategy, all rates described in the theorem are achievable. The method we use is based on
a distinguishability argument. This argument was used in [2] in the case of wireline networks. In [5],
we used similar arguments to characterize the capacity of a general class of linear deterministic relay
networks with broadcast and multiple access. The main idea behind this approach is the following: due to
the deterministic nature of these channels, each message is mapped to a deterministic sequence of transmit
codewords through the network. The destination cannot distinguish between two messages if and only
if its received signals under these two messages are identical. If so, there would be a partition of nodes
in the network such that the nodes on one side of the cut can distinguish between these two messages
and the rest cannot. This naturally corresponds to a cut separating the source and the destination in the
network, and the probability that this happens can be related to the cut value. This is the main tool that
we used in [5] to show that the cut-set upper bound can actually be achieved.

However, in the noisy case, the difference from the previous analyses is that each message is potentially
mapped to a set of possible transmit sequences. The particular transmit sequence chosen depends on the
noise realization, which can be considered "typical". Pictorially it means that there is some fuzziness
around the sequence of transmit codewords associated with each message. Hence, two messages will still
be distinguishable at the destination if the fuzzy received signals associated with them do not overlap.
This intuitively means that if we can somehow bound this randomness, a communication rate close to the
cut-set bound is achievable.

In order to illustrate the proof ideas of Theorem 2.1, we examine the network shown in Figure 2.

Fig. 2. An example of a layered Gaussian relay network.



Assume a message $w$ is transmitted by the source. Once the destination receives $\mathbf{y}_D$, it quantizes it to
get $\hat{\mathbf{y}}_D$. Then it decodes the message by finding the unique message that is jointly typical with $\hat{\mathbf{y}}_D$
(the precise definition of typicality will be given later). An error occurs if either $w$ is not jointly typical
with $\hat{\mathbf{y}}_D$ or there is another message $w'$ such that $\hat{\mathbf{y}}_D$ is jointly typical with both $w$ and $w'$.

Now for the relay network, a natural way to define whether a message $w$ is typical with a received
sequence is whether we have a "plausible" transmit sequence³ under $w$ which is jointly typical with the
received sequence. More formally, we have the following definitions.

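Definition 3.1: For a message $w$, we define the set of received sequences that are typical with the
message as

$$\mathcal{Y}_i(w) = \{\hat{\mathbf{y}}_i : (\hat{\mathbf{y}}_i, w) \in T_\delta\}, \tag{13}$$

where we still need to define what we mean by $(\hat{\mathbf{y}}_i, w) \in T_\delta$.

Definition 3.2: For a message $w$, we define the set of transmitted sequences that are typical with the
message as

$$\mathcal{X}_i(w) = \{\mathbf{x}_i : \mathbf{x}_i = f_i(\hat{\mathbf{y}}_i), \hat{\mathbf{y}}_i \in \mathcal{Y}_i(w)\}, \tag{14}$$

which defines the "typical" transmit set associated with a message $w$.

Note here that since $\mathbf{x}_i = f_i(\hat{\mathbf{y}}_i)$, naturally $(\mathbf{x}_i, \hat{\mathbf{y}}_i) \in T_\delta$. This leads us to the following definition.

Definition 3.3: We define $(\hat{\mathbf{y}}_i, w) \in T_\delta$ if

$$(\hat{\mathbf{y}}_i, \{\mathbf{x}_j\}_{j \in \mathrm{In}(i)}) \in T_\delta \quad \text{for some } \mathbf{x}_j \in \mathcal{X}_j(w), \ \forall j \in \mathrm{In}(i) \tag{15}$$

where $\mathrm{In}(i)$ is defined as the set of nodes with signals incident on node $i$.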

Therefore by this definition, if a message w is typical with a received sequence, we have a sequence 
of typical transmit sequences in the network that are jointly typical with the w and the received sequence 
at the destination. 

Now note the following important observation, 

Observation: Note that if node $i$ cannot distinguish between two messages $w, w'$, this means that
the signal received at node $i$, $\hat{\mathbf{y}}_i$, is such that $(\hat{\mathbf{y}}_i, w) \in T_\delta$ and $(\hat{\mathbf{y}}_i, w') \in T_\delta$. Therefore we see that

$$\hat{\mathbf{y}}_i \in \mathcal{Y}_i(w) \cap \mathcal{Y}_i(w'). \tag{16}$$

Due to the mapping $\mathbf{x}_i = f_i(\hat{\mathbf{y}}_i)$, we therefore see that $\mathbf{x}_i \in \mathcal{X}_i(w) \cap \mathcal{X}_i(w')$. Therefore, there exists a
sequence under $w'$ which is the same as that transmitted under $w$ and could therefore have been potentially
transmitted under $w'$.

Now, assuming a message $w$ is transmitted by the source, an error occurs at the destination if either
$w$ is not jointly typical with $\hat{\mathbf{y}}_D$, or there is another message $w'$ such that $\hat{\mathbf{y}}_D$ is jointly typical with
both $w$ and $w'$. By the law of large numbers, the probability of the first event becomes arbitrarily small as
the communication block length $T$ goes to infinity. So we just need to analyze the probability of the second
event. To do so, we evaluate the probability that $\hat{\mathbf{y}}_D$ is jointly typical with both $w$ and $w'$, where $w'$ is
another message independent of $w$. Then we use the union bound over all $w'$ to bound the probability of
the second event.

Based on our earlier observation, if $\hat{\mathbf{y}}_D$ is jointly typical with $w, w'$, then there must be a typical transmit
sequence $\mathbf{x}'_{\mathcal{V}} = (\mathbf{x}'_S, \mathbf{x}'_{A_1}, \mathbf{x}'_{A_2}, \mathbf{x}'_{B_1}, \mathbf{x}'_{B_2})$ under $w'$ such that $(\hat{\mathbf{Y}}_D, \mathbf{x}'_{B_1}, \mathbf{x}'_{B_2}) \in T_\delta$. This means that the
destination thinks this is a plausible sequence. Now for any such sequence there is a natural cut, $\Omega$, in $\mathcal{G}$
such that the nodes on the right-hand side of the cut (i.e. in $\Omega$) can tell $\mathbf{x}'_{\mathcal{V}}$ is not a plausible sequence,
and those on the left-hand side of the cut (i.e. in $\Omega^c$) cannot. Clearly this cut is a source-destination
partition.



³Plausibility essentially means that the transmit sequence is a member of the typical set of possible transmit sequences under $w$.



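For now, assume that the cut is $\Omega = \{S, A_1, B_1\}$, as shown in Figure 2. Since $A_2$, $B_2$ and $D$ think $\mathbf{x}'_{\mathcal{V}}$
is a plausible sequence, we have

$$(\hat{\mathbf{Y}}_{A_2}, \mathbf{x}'_S) \in T_\delta \tag{17}$$

$$(\hat{\mathbf{Y}}_{B_2}, \mathbf{x}'_{A_1}, \mathbf{x}'_{A_2}) \in T_\delta \tag{18}$$

$$(\hat{\mathbf{Y}}_D, \mathbf{x}'_{B_1}, \mathbf{x}'_{B_2}) \in T_\delta \tag{19}$$

For any such sequence $\mathbf{x}'_{\mathcal{V}}$, since $w$ is independent of $w'$, we have

$$P\{(\hat{\mathbf{Y}}_{A_2}, \mathbf{x}'_S) \in T_\delta\} \le 2^{-T I(X_S; \hat{Y}_{A_2})} \tag{20}$$

Now, for the layer $(A_1, A_2)$, we condition on a particular sequence $\mathbf{x}_{A_2}$ having been transmitted by $A_2$.
If $\mathbf{x}'_{A_2} = \mathbf{x}_{A_2}$, since $\mathbf{x}'_{A_1}$ is chosen independent of $\mathbf{x}_{A_1}$ we have

$$P\{(\hat{\mathbf{Y}}_{B_2}, \mathbf{x}'_{A_1}, \mathbf{x}'_{A_2}) \in T_\delta\} \le 2^{-T I(\hat{Y}_{B_2}; X_{A_1} | X_{A_2})} \tag{21}$$

and similarly, if $\mathbf{x}'_{A_2} \ne \mathbf{x}_{A_2}$, since $\mathbf{x}'_{A_1}, \mathbf{x}'_{A_2}$ are chosen independent of $\mathbf{x}_{A_1}, \mathbf{x}_{A_2}$ we have

$$P\{(\hat{\mathbf{Y}}_{B_2}, \mathbf{x}'_{A_1}, \mathbf{x}'_{A_2}) \in T_\delta\} \le 2^{-T I(\hat{Y}_{B_2}; X_{A_1}, X_{A_2})} \tag{22}$$

$$\le 2^{-T I(\hat{Y}_{B_2}; X_{A_1} | X_{A_2})} \tag{23}$$

Therefore in any case,

$$P\{(\hat{\mathbf{Y}}_{B_2}, \mathbf{x}'_{A_1}, \mathbf{x}'_{A_2}) \in T_\delta\} \le 2^{-T I(\hat{Y}_{B_2}; X_{A_1} | X_{A_2})} \tag{24}$$

Similarly we can show that

$$P\{(\hat{\mathbf{Y}}_D, \mathbf{x}'_{B_1}, \mathbf{x}'_{B_2}) \in T_\delta\} \le 2^{-T I(\hat{Y}_D; X_{B_1} | X_{B_2})} \tag{25}$$

Therefore for any typical sequence $\mathbf{x}'_{\mathcal{V}}$, the probability that (17)-(19) are satisfied is upper bounded by

$$2^{-T I(X_S; \hat{Y}_{A_2})} \times 2^{-T I(\hat{Y}_{B_2}; X_{A_1} | X_{A_2})} \times 2^{-T I(\hat{Y}_D; X_{B_1} | X_{B_2})} = 2^{-T I(X_\Omega; \hat{Y}_{\Omega^c} | X_{\Omega^c})} \tag{26}$$

Now, by using the union bound over all possible $\mathbf{x}'_{\mathcal{V}}$ and cuts, the probability of confusing $w$ with $w'$
can be bounded by

$$P\{w \to w'\} \le |\mathcal{X}_{\mathcal{V}}(w')| \sum_{\Omega} 2^{-T I(X_\Omega; \hat{Y}_{\Omega^c} | X_{\Omega^c})} \tag{27}$$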

In the next section, we make these arguments precise, and by bounding $|\mathcal{X}_{\mathcal{V}}(w')|$ we prove our main
Theorem 2.1 for networks with a layered structure.

C. Proof for layered networks 

In this section we extend the idea from Section III-B and analyze an $l_D$-layer network, $\mathcal{G}$.

Based on the proof strategy illustrated in Section III-B, we proceed with the error probability analysis
of our scheme that was described in Section III-A. Assume message $w$ is being transmitted. To bound the
probability of error, we just need to analyze the probability that $\hat{\mathbf{y}}_D$ is jointly typical with both $w$ and $w'$, for
a message $w'$ independent of $w$. We denote this event by $w \to w'$.

If $\hat{\mathbf{y}}_D$ is jointly typical with $w'$, then there must be a typical transmit sequence $\mathbf{x}'_{\mathcal{V}} \in \mathcal{X}_{\mathcal{V}}(w')$ under $w'$
such that $(\hat{\mathbf{Y}}_D, \mathbf{x}'_{\gamma_{l_D - 1}}) \in T_\delta$, where $\gamma_{l_D - 1}$ is the set of nodes at layer $l_D - 1$ of the network. This means
that the destination thinks this is a plausible sequence. Therefore, there is a natural source-destination cut,
$\Omega$, in $\mathcal{G}$ such that the nodes on the right-hand side of the cut (i.e. in $\Omega$) can tell $\mathbf{x}'_{\mathcal{V}}$ is not a plausible
sequence, and those on the left-hand side of the cut (i.e. in $\Omega^c$) cannot. Note that due to the layered
structure of the network, for any such cut $\Omega$ we can create $d = l_D$ disjoint sub-networks of nodes
corresponding to each layer of the network, with the nodes $\beta_{l-1}(\Omega)$, at distance $l - 1$ from $S$ and in $\Omega$,
on one side and the nodes $\beta_l(\Omega^c)$, at distance $l$ from $S$ and in $\Omega^c$, on the other, for $l = 1, \ldots, l_D$. Hence,
by definition we have

$$(\hat{\mathbf{Y}}_{\beta_l(\Omega^c)}, \mathbf{x}'_{\beta_{l-1}(\Omega)}, \mathbf{x}'_{\beta_{l-1}(\Omega^c)}) \in T_\delta, \quad l = 1, \ldots, l_D \tag{28}$$

Therefore, similar to the pairwise error analysis done in Section III-B, we can show

$$P\{w \to w'\} \le |\mathcal{X}_{\mathcal{V}}(w')| \sum_{\Omega} 2^{-T I(X_\Omega; \hat{Y}_{\Omega^c} | X_{\Omega^c})} \tag{29}$$

As the last ingredient of the proof, we state the following lemma, which is proved in the appendix.

Lemma 3.4: Consider a layered Gaussian relay network $\mathcal{G}$. Then

$$|\mathcal{X}_{\mathcal{V}}(w')| \le 2^{T \kappa_1} \tag{30}$$

where $\kappa_1 = |\mathcal{V}|$ is a constant depending on the total number of nodes in $\mathcal{G}$.

Fig. 3. An example of a general Gaussian network with unequal paths from S to D is shown in (a). The corresponding unfolded network
is shown in (b). Examples of steady cuts and wiggling cuts are shown in (b) by solid and dotted lines, respectively.

Therefore, by (29) and Lemma 3.4 we have the following.

Lemma 3.5: Given a Gaussian relay network $\mathcal{G}$ with a layered structure, all rates $R$ satisfying the
following condition are achievable:

$$R < \min_{\Omega \in \Lambda_D} I(\hat{Y}_{\Omega^c}; X_\Omega | X_{\Omega^c}) - \kappa_1 \tag{31}$$

where $X_i$, $i \in \mathcal{V}$, are iid with complex normal (Gaussian) distribution, and $\kappa_1 = |\mathcal{V}|$ is a constant depending
on the total number of nodes in $\mathcal{G}$.



To prove our main Theorem 2.1 for layered networks, we state the following lemma, which is proved
in the appendix.

Lemma 3.6: Given a Gaussian relay network $\mathcal{G}$,

$$\overline{C} - \min_{\Omega \in \Lambda_D} I(\hat{Y}_{\Omega^c}; X_\Omega | X_{\Omega^c}) \le \kappa_2 \tag{32}$$

where $X_i$, $i \in \mathcal{V}$, are iid with complex normal (Gaussian) distribution, $\overline{C}$ is the cut-set upper bound on
the capacity of $\mathcal{G}$ as described in equation (3), and $\kappa_2 = 2|\mathcal{V}|$.

Now by Lemma 3.5 and Lemma 3.6 we have the following main result.

Theorem 3.7: Given a Gaussian relay network $\mathcal{G}$ with a layered structure, all rates $R$ satisfying the
following condition are achievable:

$$R < \overline{C} - \kappa_{Lay} \tag{33}$$

where $\overline{C}$ is the cut-set upper bound on the capacity of $\mathcal{G}$ as described in equation (3), and $\kappa_{Lay} = \kappa_1 + \kappa_2 =
3|\mathcal{V}|$ is a constant depending on the total number of nodes in $\mathcal{G}$ (denoted by $|\mathcal{V}|$).

IV. Proof for general networks 

Given the proof for layered networks with equal path lengths, we are ready to tackle the proof of
Theorem 2.1 for general Gaussian relay networks.

The ingredients are developed below. The first is that any Gaussian network can be unfolded over time to
create a layered Gaussian network (this idea was introduced for graphs in [2] to handle cycles in a graph).
The idea is to unfold the network to $K$ stages such that the $i$-th stage represents what happens in the
network during symbol times $(i - 1)T$ to $iT - 1$. For example, Figure 3(a) shows a network with unequal paths
from $S$ to $D$, and Figure 3(b) shows the unfolded form of this network. Each node
$v \in \mathcal{V}$ appears at stage $1 \le i \le K$ as $v[i]$. Now we state the following lemma, which is a corollary
of Theorem 3.7.

Lemma 4.1: Given a Gaussian relay network $\mathcal{G}$, all rates $R$ satisfying the following condition are
achievable:

$$R < \frac{1}{K} \min_{\Omega_{unf}} I(\hat{Y}_{\Omega_{unf}^c}; X_{\Omega_{unf}} | X_{\Omega_{unf}^c}) - \kappa_1 \tag{34}$$

where the minimum is over the source-destination cuts $\Omega_{unf}$ of the time-expanded graph associated with $\mathcal{G}$, the random variables $\{X_i[t]\}_{1 \le t \le K}$, $i \in \mathcal{V}$, are iid
with complex normal (Gaussian) distribution, and $\kappa_1 = 3|\mathcal{V}|$.

Proof: By unfolding $\mathcal{G}$ we get an acyclic network such that all the paths from the source to the
destination have equal length. Therefore, by Theorem 3.7, all rates $R_{unf}$ satisfying the following condition
are achievable in the time-expanded graph:

$$R_{unf} < \min_{\Omega_{unf}} I(\hat{Y}_{\Omega_{unf}^c}; X_{\Omega_{unf}} | X_{\Omega_{unf}^c}) - \kappa_{unf} \tag{35}$$

where $\{X_i[t]\}_{1 \le t \le K}$, $i \in \mathcal{V}$, are iid with complex normal (Gaussian) distribution, and $\kappa_{unf} = 3K|\mathcal{V}|$, since the time-expanded graph has $K|\mathcal{V}|$ nodes.
Since it takes $K$ steps to translate an achievable scheme in the time-expanded graph to an achievable
scheme in the original graph, and $\kappa_1 = \frac{1}{K} \kappa_{unf} = 3|\mathcal{V}|$, the lemma is proved. ■
Note that the general achievability scheme that we use here is similar to the one described in Section
III-A for layered networks, except that now the message $W \in \{1, 2, \ldots, 2^{KRT}\}$ is encoded by the source $S$
into a signal over $KT$ transmission times (symbols). Still, each relay operates over blocks of $T$
symbols. In particular, each received sequence $\mathbf{y}_i^{(k)}$ at node $i$ is quantized into $\hat{\mathbf{y}}_i^{(k)}$, which is then randomly
mapped into a Gaussian codeword $\mathbf{x}_i^{(k)}$ using a random (binning) function $f_i(\hat{\mathbf{y}}_i^{(k)})$. Given the knowledge
of all the encoding functions at the relays and the signals received over $K + |\mathcal{V}| - 2$ blocks, the decoder $D$
attempts to decode the message $W$ sent by the source.



If we look at different cuts in the time-expanded graph we notice that there are two types of cuts.
One type separates the nodes at different stages identically. An example of such a steady cut is drawn
with a solid line in Figure 3(b). However, there is another type of cut which does not behave identically at
different stages. An example of such a wiggling cut is drawn with a dotted line in Figure 3(b). There is no
correspondence between these cuts and the cuts in the original network.

Now comparing Lemma 4.1 to the main Theorem 2.1 we want to prove, we notice that in this lemma
the achievable rate is found by taking the minimum of cut values over all cuts in the time-expanded
graph (steady and wiggling ones, as shown in Figure 3). However, in Theorem 2.1 we want to prove that
we can achieve a rate by taking the minimum of cut values over only the cuts in the original graph or,
equivalently, over the steady cuts in the time-expanded network. In the following lemma, which is proved in
the appendix, we show that asymptotically as $K \to \infty$ this difference (normalized by $1/K$) vanishes.

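Lemma 4.2: Consider a Gaussian relay network $\mathcal{G}$. Then for any cut $\Omega_{unf}$ on the unfolded graph we
have

$$(K - L + 1) \min_{\Omega \in \Lambda_D} I(\hat{Y}_{\Omega^c}; X_\Omega | X_{\Omega^c}) \le I(\hat{Y}_{\Omega_{unf}^c}; X_{\Omega_{unf}} | X_{\Omega_{unf}^c}) \tag{36}$$

where $L = 2^{|\mathcal{V}| - 2}$, $X_i$, $i \in \mathcal{V}$, are iid with complex normal (Gaussian) distribution, and $\{X_i[t]\}_{1 \le t \le K}$, $i \in \mathcal{V}$, are
also iid with complex normal (Gaussian) distribution.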
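The effect of the factor $(K - L + 1)$ in Lemma 4.2, once normalized by $1/K$, can be checked numerically. The following is a small sketch; the function names are assumptions introduced for illustration.

```python
def num_cut_patterns(num_nodes):
    """L = 2^(|V|-2): the number of subsets of V that contain D but not S."""
    return 2 ** (num_nodes - 2)

def normalized_factor(K, num_nodes):
    """(K - L + 1) / K: the fraction of the original min-cut value that
    Lemma 4.2 guarantees per stage for any cut of the K-stage unfolded graph."""
    L = num_cut_patterns(num_nodes)
    return max(K - L + 1, 0) / K

# For a four-node network, L = 4; the factor approaches 1 as K grows.
factors = [normalized_factor(K, 4) for K in (4, 40, 4000)]
```

Since $L$ depends only on the number of nodes, choosing the number of stages $K$ much larger than $L$ makes the loss from wiggling cuts negligible.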

Hence, by Lemma 4.1 and Lemma 4.2 we have the following lemma.

Lemma 4.3: Given a Gaussian relay network $\mathcal{G}$, all rates $R$ satisfying the following condition are
achievable:

$$R < \min_{\Omega \in \Lambda_D} I(\hat{Y}_{\Omega^c}; X_\Omega | X_{\Omega^c}) - \kappa_1 \tag{37}$$

where $X_i$, $i \in \mathcal{V}$, are iid with complex normal (Gaussian) distribution, and $\kappa_1 = 3|\mathcal{V}|$.
Now by Lemma 3.6 we know that

$$\overline{C} - \min_{\Omega \in \Lambda_D} I(Y_{\Omega^c}; X_\Omega | X_{\Omega^c}) \le \overline{C} - \min_{\Omega \in \Lambda_D} I(\hat{Y}_{\Omega^c}; X_\Omega | X_{\Omega^c}) \le 2|\mathcal{V}| \tag{38}$$

where $X_i$, $i \in \mathcal{V}$, are iid with complex normal (Gaussian) distribution.

Therefore, by Lemma 4.3 and inequality (38), all rates up to $\overline{C} - |\mathcal{V}|(3 + 2) = \overline{C} - 5|\mathcal{V}|$ are achieved
and the proof of our main Theorem 2.1 is complete.
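Appendix I

PROOF OF BEAM-FORMING LEMMA
We know that the capacity of an $r \times t$ MIMO channel $H$ with water filling is

$$C_{wf} = \sum_{i=1}^{n} \log(1 + Q_{ii} \lambda_i) \tag{39}$$

where $n = \min(r, t)$, the $\lambda_i$'s are the singular values of $H$, and $Q_{ii}$ is given by the water-filling solution satisfying

$$\sum_{i=1}^{n} Q_{ii} = nP \tag{40}$$

With equal power allocation

$$C_{ep} = \sum_{i=1}^{n} \log(1 + P \lambda_i) \tag{41}$$

Now note that

$$\begin{aligned}
C_{wf} - C_{ep} &= \log \frac{\prod_{i=1}^{n} (1 + Q_{ii} \lambda_i)}{\prod_{i=1}^{n} (1 + P \lambda_i)} \\
&\le \log \frac{\prod_{i=1}^{n} (1 + Q_{ii} \lambda_i)}{\prod_{i=1}^{n} \max(1, P \lambda_i)} \\
&= \log \prod_{i=1}^{n} \left[ \frac{1 + Q_{ii} \lambda_i}{\max(1, P \lambda_i)} \right] \\
&= \log \prod_{i=1}^{n} \left[ \frac{1}{\max(1, P \lambda_i)} + \frac{Q_{ii} \lambda_i}{\max(1, P \lambda_i)} \right] \\
&\le \log \prod_{i=1}^{n} \left( 1 + \frac{Q_{ii}}{P} \right)
\end{aligned} \tag{42-47}$$

Now note that

$$\sum_{i=1}^{n} \left( 1 + \frac{Q_{ii}}{P} \right) = 2n \tag{48}$$

and therefore by the arithmetic mean-geometric mean inequality we have

$$\prod_{i=1}^{n} \left( 1 + \frac{Q_{ii}}{P} \right) \le \left( \frac{\sum_{i=1}^{n} \left( 1 + \frac{Q_{ii}}{P} \right)}{n} \right)^{n} = 2^{n} \tag{49}$$

and hence

$$C_{wf} - C_{ep} \le n \tag{50}$$

Hence,

$$\overline{C} \le \min_{\Omega \in \Lambda_D} I(Y_{\Omega^c}; X_\Omega | X_{\Omega^c}) + |\mathcal{V}| \tag{51}$$

where $X_i$, $i \in \mathcal{V}$, are restricted to be iid with complex normal (Gaussian) distribution.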

Next note that $\hat{Y}$ is obtained by quantizing $Y$ at the noise level. The effect of quantization noise can be compensated by
adding a factor of two more power at each transmitter. Therefore, for each cut $\Omega$ we have

$$I(Y_{\Omega^c}; X_\Omega | X_{\Omega^c}) \le I(\hat{Y}_{\Omega^c}; X_\Omega | X_{\Omega^c}) + |\mathcal{V}| \log 2 \tag{52}$$

where $X_i$, $i \in \mathcal{V}$, are restricted to be iid with complex normal (Gaussian) distribution. Now by (51) and (52), the lemma is
proved.
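The bound $C_{wf} - C_{ep} \le n$ derived in this appendix can be checked numerically. The sketch below computes the water-filling allocation by bisection on the water level; the function names and the example values of $\lambda_i$ and $P$ are assumptions chosen only for illustration, and rates are measured in bits (logarithms to base 2).

```python
import math

def water_filling(lams, total_power, iters=200):
    """Return powers Q_i = max(mu - 1/lam_i, 0) whose sum is total_power.
    The water level mu is found by bisection; all lam_i must be positive."""
    lo, hi = 0.0, total_power + max(1.0 / l for l in lams)
    for _ in range(iters):
        mu = (lo + hi) / 2.0
        used = sum(max(mu - 1.0 / l, 0.0) for l in lams)
        if used > total_power:
            hi = mu
        else:
            lo = mu
    mu = (lo + hi) / 2.0
    return [max(mu - 1.0 / l, 0.0) for l in lams]

def capacity_gap(lams, P):
    """C_wf - C_ep in bits, with total power n*P as in the constraint (40)."""
    n = len(lams)
    Q = water_filling(lams, n * P)
    c_wf = sum(math.log2(1.0 + q * l) for q, l in zip(Q, lams))
    c_ep = sum(math.log2(1.0 + P * l) for l in lams)
    return c_wf - c_ep

# Hypothetical channel with widely spread gains.
gap = capacity_gap([100.0, 1.0, 0.01], P=1.0)
```

The gap is nonnegative because equal power allocation is one feasible allocation, and by the argument above it never exceeds $n$ bits regardless of how the $\lambda_i$ are spread.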

Appendix II

PROOF OF LEMMA 3.4

Assume message $w'$ is transmitted. Consider a relay, $R$, at the first layer. Then the total number of quantized outputs at $R$
would be

$$2^{H(\hat{\mathbf{Y}}_R | \mathbf{X}_S)} = 2^{T I(Y_R; \hat{Y}_R | X_S)} \tag{53}$$

Since we are using an optimal Gaussian vector quantizer at the noise level (i.e. with distortion 1), we can write

$$\hat{Y}_R = \alpha Y_R + N, \tag{54}$$

where $N \sim \mathcal{CN}(0, \sigma_N^2)$ is a complex Gaussian noise independent of $Y_R$ and

$$\alpha = \frac{\sigma_Y^2 - 1}{\sigma_Y^2}, \qquad \sigma_N^2 = (1 - \alpha^2) \sigma_Y^2 - 1 \tag{55}$$

Hence

$$I(Y_R; \hat{Y}_R | X_S) = \log\left(1 + \frac{\alpha^2}{\sigma_N^2}\right) \tag{56}$$

$$= \log(1 + \alpha) \le 1 \tag{57}$$

Hence the list size of $R$ would be smaller than $2^T$. Now the list of typical transmit sequences can be viewed as a tree such
that at each node, due to the noise, each path will be branched to at most $2^T$ other typical possibilities. Therefore, the total
number of typical transmit sequences would be smaller than the product of the expansion coefficient (i.e. $2^T$) over all nodes
in the graph. Or, more precisely,

$$\log(|\mathcal{X}_{\mathcal{V}}(w')|) = \log(|\mathcal{Y}_{\mathcal{V}}(w')|) \tag{58}$$

$$= H(\hat{\mathbf{Y}}_{\mathcal{V}} | w') = \sum_{l=1}^{l_D} H(\hat{\mathbf{Y}}_{\gamma_l} | \hat{\mathbf{Y}}_{\gamma_{l-1}}) \tag{59}$$

$$= \sum_{l=1}^{l_D} H(\hat{\mathbf{Y}}_{\gamma_l} | \mathbf{X}_{\gamma_{l-1}}) \tag{60}$$

$$\le \sum_{l=1}^{l_D} T |\gamma_l| = T |\mathcal{V}| \tag{61}$$

where $\gamma_l$ is the set of nodes at the $l$-th layer of the network. Hence,

$$|\mathcal{X}_{\mathcal{V}}(w')| \le 2^{T |\mathcal{V}|} \tag{62}$$

and the proof is complete.



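Appendix III

PROOF OF LEMMA 4.2
First, we prove a lemma which is a slight generalization of Lemma 6.4 in [5].

Lemma 3.1: Let $V_1, \ldots, V_l$ be $l$ non-identical subsets of $\mathcal{V} - \{S\}$ such that $D \in V_i$ for all $1 \le i \le l$. Also assume a
product distribution on continuous random variables $X_i$, $i \in \mathcal{V}$. Then

$$h(Y_{V_2} | X_{V_1}) + \cdots + h(Y_{V_l} | X_{V_{l-1}}) + h(Y_{V_1} | X_{V_l}) \ge \sum_{i=1}^{l} h(Y_{\tilde{V}_i} | X_{\tilde{V}_i}) \tag{63}$$

where for $k = 1, \ldots, l$,

$$\tilde{V}_k = \bigcup_{\{i_1, \ldots, i_k\} \subseteq \{1, \ldots, l\}} (V_{i_1} \cap \cdots \cap V_{i_k}) \tag{64}$$

or in other words each $\tilde{V}_j$ is the union of $\binom{l}{j}$ sets such that each set is the intersection of $j$ of the $V_i$'s.

Proof: First note that

$$h(Y_{V_2} | X_{V_1}) + \cdots + h(Y_{V_l} | X_{V_{l-1}}) + h(Y_{V_1} | X_{V_l}) = h(Y_{V_2}, X_{V_1}) + \cdots + h(Y_{V_l}, X_{V_{l-1}}) + h(Y_{V_1}, X_{V_l}) - \sum_{i=1}^{l} h(X_{V_i})$$

and

$$\sum_{i=1}^{l} h(Y_{\tilde{V}_i} | X_{\tilde{V}_i}) = \sum_{i=1}^{l} h(Y_{\tilde{V}_i}, X_{\tilde{V}_i}) - \sum_{i=1}^{l} h(X_{\tilde{V}_i}) \tag{65}$$

Now define the set

$$W_i = \{Y_{V_i}, X_{V_{i-1}}\}, \quad i = 1, \ldots, l \tag{66}$$

where $V_0 = V_l$.

It is easy to show that

$$\sum_{i=1}^{l} h(X_{V_i}) = \sum_{i=1}^{l} h(X_{\tilde{V}_i}) \tag{67}$$

Therefore, we just need to prove that

$$\sum_{i=1}^{l} h(W_i) \ge \sum_{i=1}^{l} h(Y_{\tilde{V}_i}, X_{\tilde{V}_i}) \tag{68}$$

Now, since the differential entropy function is a submodular function we have

$$\sum_{i=1}^{l} h(W_i) \ge \sum_{i=1}^{l} h(\tilde{W}_i) \tag{69}$$

where

$$\tilde{W}_r = \bigcup_{\{i_1, \ldots, i_r\} \subseteq \{1, \ldots, l\}} (W_{i_1} \cap \cdots \cap W_{i_r}), \quad r = 1, \ldots, l \tag{70}$$

Now for any $r$ ($1 \le r \le l$) we have

$$\begin{aligned}
\tilde{W}_r &= \bigcup_{\{i_1, \ldots, i_r\} \subseteq \{1, \ldots, l\}} \left( \{Y_{V_{i_1}}, X_{V_{i_1 - 1}}\} \cap \cdots \cap \{Y_{V_{i_r}}, X_{V_{i_r - 1}}\} \right) \\
&= \bigcup_{\{i_1, \ldots, i_r\} \subseteq \{1, \ldots, l\}} \left\{ Y_{V_{i_1} \cap \cdots \cap V_{i_r}}, X_{V_{(i_1 - 1)} \cap \cdots \cap V_{(i_r - 1)}} \right\} \\
&= \left\{ Y_{\bigcup_{\{i_1, \ldots, i_r\}} (V_{i_1} \cap \cdots \cap V_{i_r})}, X_{\bigcup_{\{i_1, \ldots, i_r\}} (V_{(i_1 - 1)} \cap \cdots \cap V_{(i_r - 1)})} \right\} \\
&= \{Y_{\tilde{V}_r}, X_{\tilde{V}_r}\}
\end{aligned}$$

Therefore by equation (69) we have

$$\sum_{i=1}^{l} h(W_i) \ge \sum_{i=1}^{l} h(\tilde{W}_i) \tag{71}$$

$$= \sum_{i=1}^{l} h(Y_{\tilde{V}_i}, X_{\tilde{V}_i}) \tag{72}$$

Hence the lemma is proved.

■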

Now we are ready to prove Lemma 4.2. First note that any cut in the unfolded graph, $\Omega_{unf}$, partitions the nodes at each stage
$1 \le i \le K$ into $U_i$ (on the left of the cut) and $V_i$ (on the right of the cut). If at one stage $S[i] \in V_i$ or $D[i] \in U_i$, then the cut
passes through one of the infinite-capacity edges (capacity $Kq$) and hence the lemma is obviously proved. Therefore, without
loss of generality, assume that $S[i] \in U_i$ and $D[i] \in V_i$ for all $1 \le i \le K$. Now since for each $i \in \mathcal{V}$, $\{x_i[t]\}_{1 \le t \le K}$ are iid,
we can write

$$I(\hat{Y}_{\Omega_{unf}^c}; X_{\Omega_{unf}} | X_{\Omega_{unf}^c}) = \sum_{i=1}^{K-1} I(\hat{Y}_{V_{i+1}}; X_{U_i} | X_{V_i}) \tag{73}$$

Consider the sequence of $V_i$'s. Note that there are a total of $L = 2^{|\mathcal{V}| - 2}$ possible subsets of $\mathcal{V}$ that contain $D$ but not $S$.
Assume that $V_s$ is the first set that is revisited, and that it is revisited at step $V_{s+l}$. We have

$$\sum_{i=s}^{s+l-1} I(\hat{Y}_{V_{i+1}}; X_{U_i} | X_{V_i}) = \sum_{i=s}^{s+l-1} \left[ h(\hat{Y}_{V_{i+1}} | X_{V_i}) - h(\hat{Y}_{V_{i+1}} | X_{V_i}, X_{U_i}) \right] \tag{74}$$

Now by Lemma 3.1 we have

$$\sum_{i=s}^{s+l-1} h(\hat{Y}_{V_{i+1}} | X_{V_i}) \ge \sum_{i=1}^{l} h(\hat{Y}_{\tilde{V}_i} | X_{\tilde{V}_i}) \tag{75}$$

where the $\tilde{V}_i$'s are as described in Lemma 3.1. Next, note that $h(\hat{Y}_{V_{i+1}} | X_{V_i}, X_{U_i})$ is just the entropy of channel noises, and since
for any $v \in \mathcal{V}$ we have

$$|\{i \,|\, v \in V_i\}| = |\{j \,|\, v \in \tilde{V}_j\}| \tag{76}$$

we get

$$\sum_{i=s}^{s+l-1} h(\hat{Y}_{V_{i+1}} | X_{V_i}, X_{U_i}) = \sum_{i=1}^{l} h(\hat{Y}_{\tilde{V}_i} | X_{\tilde{V}_i}, X_{\tilde{V}_i^c}) \tag{77}$$

Now by putting (75) and (77) together, we get

$$\sum_{i=s}^{s+l-1} I(\hat{Y}_{V_{i+1}}; X_{U_i} | X_{V_i}) \ge \sum_{i=1}^{l} I(\hat{Y}_{\tilde{V}_i}; X_{\tilde{V}_i^c} | X_{\tilde{V}_i}) \tag{78}$$

$$\ge l \min_{\Omega \in \Lambda_D} I(\hat{Y}_{\Omega^c}; X_\Omega | X_{\Omega^c}) \tag{79}$$

Now since in any $L - 1$ time frame there is at least one loop, except for at most a path of length $L - 1$ everything in
$\sum_{i=1}^{K-1} I(\hat{Y}_{V_{i+1}}; X_{U_i} | X_{V_i})$ can be replaced with the value of the min-cut. Therefore,

$$\sum_{i=1}^{K-1} I(\hat{Y}_{V_{i+1}}; X_{U_i} | X_{V_i}) \ge (K - L + 1) \min_{\Omega \in \Lambda_D} I(\hat{Y}_{\Omega^c}; X_\Omega | X_{\Omega^c}) \tag{80}$$

and hence the proof is complete.



